Development of an Intuitive GUI for Non-Expert Teleoperation of Humanoid Robots
Barret, Austin, Lau, Meng Cheng
The operation of humanoid robots is an essential field of research with many practical and competitive applications. Many of these systems, however, do not invest heavily in a graphical user interface (GUI) designed for non-expert operators. The focus of this research is to develop a scalable GUI that is simple and intuitive enough for non-expert operators to control the robot through a FIRA-regulated obstacle course. Drawing on common practices from user interface (UI) development and on concepts from human-robot interaction (HRI) and related fields, we will develop a new interface with the goal of a non-expert teleoperation system.
Deep Learning-Based Identification of Inconsistent Method Names: How Far Are We?
Wang, Taiming, Zhang, Yuxia, Jiang, Lin, Tang, Yi, Li, Guangjie, Liu, Hui
Concise and meaningful method names are crucial for program comprehension and maintenance. However, method names may become inconsistent with their corresponding implementations, causing confusion and errors. Several deep learning (DL)-based approaches have been proposed to identify such inconsistencies, with initial evaluations showing promising results. However, these evaluations typically use a balanced dataset, where the numbers of inconsistent and consistent names are equal. This setup, along with flawed dataset construction, leads to false positives, making the reported performance less reliable in real-world scenarios, where most method names are consistent. In this paper, we present an empirical study that evaluates state-of-the-art DL-based methods for identifying inconsistent method names. We create a new benchmark by combining automatic identification from commit histories with manual developer inspections, reducing false positives. We evaluate five representative DL approaches (one retrieval-based and four generation-based) on this benchmark. Our results show that performance drops substantially when moving from the balanced dataset to the new benchmark. We further conduct quantitative and qualitative analyses to understand the strengths and weaknesses of the approaches. Retrieval-based methods perform well on simple methods and those with popular name sub-tokens but fail due to inefficient representation techniques. Generation-based methods struggle with inaccurate similarity calculations and immature name generation. Based on these findings, we propose improvements using contrastive learning and large language models (LLMs). Our study suggests that significant improvements are needed before these DL approaches can be effectively applied to real-world software systems.
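The core concern about balanced evaluation is arithmetic: a detector with fixed per-class error rates loses most of its precision once consistent names dominate, as they do in real codebases. A minimal sketch with hypothetical rates (all numbers below are illustrative, not taken from the paper):

```python
# Toy illustration: how fixed true/false positive rates translate into very
# different precision on balanced vs. realistic data.
def precision_at_ratio(tpr, fpr, n_inconsistent, n_consistent):
    """Precision of flagging 'inconsistent' names given the class counts."""
    tp = tpr * n_inconsistent          # inconsistent names correctly flagged
    fp = fpr * n_consistent            # consistent names wrongly flagged
    return tp / (tp + fp)

tpr, fpr = 0.80, 0.10                  # hypothetical detector quality
# Balanced evaluation set: 1,000 inconsistent vs. 1,000 consistent names.
print(f"balanced:  {precision_at_ratio(tpr, fpr, 1_000, 1_000):.2f}")   # ~0.89
# Realistic codebase: 1,000 inconsistent vs. 99,000 consistent names.
print(f"realistic: {precision_at_ratio(tpr, fpr, 1_000, 99_000):.2f}")  # ~0.07
```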
- Asia > China > Beijing > Beijing (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- North America > United States > Nevada (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Generative Visual Communication in the Era of Vision-Language Models
Visual communication, dating back to prehistoric cave paintings, is the use of visual elements to convey ideas and information. In today's visually saturated world, effective design demands an understanding of graphic design principles, visual storytelling, human psychology, and the ability to distill complex information into clear visuals. This dissertation explores how recent advancements in vision-language models (VLMs) can be leveraged to automate the creation of effective visual communication designs. Although generative models have made great progress in generating images from text, they still struggle to simplify complex ideas into clear, abstract visuals and are constrained by pixel-based outputs, which lack flexibility for many design tasks. To address these challenges, we constrain the models' operational space and introduce task-specific regularizations. We explore various aspects of visual communication, namely, sketches and visual abstraction, typography, animation, and visual inspiration.
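One way to picture the "constrained operational space" idea is to optimize a small set of stroke parameters against a frozen vision-language critic instead of generating pixels directly. The sketch below is a toy illustration under stated assumptions: it splats strokes as Gaussian ink rather than using a real differentiable vector rasterizer, scores them with CLIP via Hugging Face transformers, and skips CLIP's usual pixel normalization for brevity; it is not the dissertation's pipeline.

```python
# Toy VLM-guided sketch optimization (illustrative; NOT the dissertation's
# method). Strokes are splatted as Gaussian ink; a frozen CLIP scores the canvas.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
for p in model.parameters():
    p.requires_grad_(False)                       # the VLM is a frozen critic

strokes = torch.rand(8, 4, requires_grad=True)    # 8 strokes: (x0, y0, x1, y1)
opt = torch.optim.Adam([strokes], lr=0.01)
ys, xs = torch.meshgrid(torch.linspace(0, 1, 224),
                        torch.linspace(0, 1, 224), indexing="ij")

def render(strokes, sigma=1e-3):
    """Differentiable canvas: sample points along each stroke, splat Gaussians."""
    t = torch.linspace(0, 1, 16).unsqueeze(1)
    canvas = torch.zeros(224, 224)
    for s in strokes:
        pts = (1 - t) * s[:2] + t * s[2:]         # 16 points on the segment
        d2 = (xs[None] - pts[:, 0, None, None]) ** 2 \
           + (ys[None] - pts[:, 1, None, None]) ** 2
        canvas = canvas + torch.exp(-d2 / sigma).sum(0)
    return (1 - canvas.clamp(0, 1)).expand(3, -1, -1)   # dark ink, white paper

text = proc(text=["a minimalist sketch of a cat"], return_tensors="pt")
txt = model.get_text_features(**text)
txt = txt / txt.norm(dim=-1, keepdim=True)

for step in range(200):
    img = model.get_image_features(pixel_values=render(strokes)[None])
    img = img / img.norm(dim=-1, keepdim=True)
    loss = 1 - (img * txt).sum()                  # pull the canvas toward the prompt
    opt.zero_grad(); loss.backward(); opt.step()
```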
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.13)
- Asia > China > Beijing > Beijing (0.04)
- (22 more...)
- Research Report > Promising Solution (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)
ESALE: Enhancing Code-Summary Alignment Learning for Source Code Summarization
Fang, Chunrong, Sun, Weisong, Chen, Yuchen, Chen, Xiao, Wei, Zhao, Zhang, Quanjun, You, Yudu, Luo, Bin, Liu, Yang, Chen, Zhenyu
(Source) code summarization aims to automatically generate succinct natural language summaries for given code snippets. Such summaries play a significant role in helping developers understand and maintain code. Inspired by neural machine translation, deep learning-based code summarization techniques widely adopt an encoder-decoder framework, where the encoder transforms given code snippets into context vectors, and the decoder decodes context vectors into summaries. Recently, large-scale pre-trained models for source code have been equipped with encoders capable of producing general context vectors and have achieved substantial improvements on code summarization. However, because they are usually trained mainly on code-focused tasks, they capture general code features but still fall short in capturing the specific features that need to be summarized. This paper proposes a novel approach to improve code summarization based on summary-focused tasks. Specifically, we exploit a multi-task learning paradigm to train the encoder on three summary-focused tasks to enhance its ability to learn code-summary alignment: unidirectional language modeling (ULM), masked language modeling (MLM), and action word prediction (AWP). Unlike pre-trained models that mainly predict masked tokens in code snippets, we design ULM and MLM to predict masked words in summaries. Intuitively, predicting summary words based on given code snippets helps the encoder learn the code-summary alignment. Additionally, we introduce the domain-specific task AWP to enhance the encoder's ability to learn the alignment between action words and code snippets. Extensive experiments on four datasets demonstrate that our approach, called ESALE, significantly outperforms baselines on all three widely used metrics, including BLEU, METEOR, and ROUGE-L.
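To illustrate the summary-side masking behind the MLM objective (a sketch of the idea, not the authors' implementation): given the token ids of a code-plus-summary pair, only positions inside the summary span are masked, so the encoder must recover summary words from the code context.

```python
# Illustrative summary-side masking: code tokens are left intact, and a
# fraction of summary tokens is replaced with a mask id for the MLM head.
import torch

def mask_summary_tokens(ids, summary_start, mask_id, p=0.15):
    """Return (masked_ids, labels); labels are -100 outside masked positions."""
    masked, labels = ids.clone(), torch.full_like(ids, -100)
    sel = torch.rand(ids.shape) < p
    sel[:, :summary_start] = False            # never mask the code tokens
    labels[sel] = ids[sel]
    masked[sel] = mask_id
    return masked, labels

ids = torch.tensor([[101, 7, 8, 9, 102, 21, 22, 23, 24, 102]])  # toy ids
masked, labels = mask_summary_tokens(ids, summary_start=5, mask_id=103)
# The encoder's MLM head is trained with cross-entropy on `labels`;
# ESALE combines this with the ULM and action-word-prediction (AWP) losses.
```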
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (33 more...)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.86)
Quantum-tunnelling deep neural networks for sociophysical neuromorphic AI
The discovery of the quantum tunnelling effect -- the transmission of particles through a high potential barrier -- was one of the most impressive achievements of quantum mechanics in the 1920s. Responding to contemporary challenges, I introduce a novel deep neural network (DNN) architecture that processes information using the effect of quantum tunnelling. I demonstrate the ability of the quantum tunnelling DNN (QT-DNN) to recognise optical illusions like a human. A hardware implementation of the QT-DNN is expected to result in an inexpensive and energy-efficient neuromorphic chip suitable for applications in autonomous vehicles. The optical-illusion recognition tests developed in this paper should lay the foundations for cognitive benchmarking tasks for future AI systems, benefiting the fields of sociophysics and behavioural science.
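The paper's exact activation function is not reproduced here, so the following is a hedged sketch: it uses the textbook transmission probability through a rectangular potential barrier, a smooth saturating curve that could serve as a tunnelling-inspired neuron nonlinearity.

```python
# Illustrative tunnelling-style activation (an assumption, not the paper's
# definition): transmission probability T(E) through a rectangular barrier.
import numpy as np

def tunnel_activation(E, V0=1.0, L=1.0, m=1.0, hbar=1.0):
    """T(E) for a rectangular barrier of height V0 and width L (E > 0)."""
    E = np.asarray(E, dtype=float)
    T = np.empty_like(E)
    below = E < V0                                    # classically forbidden regime
    kappa = np.sqrt(2 * m * (V0 - E[below])) / hbar
    T[below] = 1.0 / (1.0 + V0**2 * np.sinh(kappa * L) ** 2
                      / (4 * E[below] * (V0 - E[below])))
    k = np.sqrt(2 * m * (E[~below] - V0)) / hbar      # above-barrier regime
    T[~below] = 1.0 / (1.0 + V0**2 * np.sin(k * L) ** 2
                       / (4 * E[~below] * (E[~below] - V0)))
    return T

x = np.linspace(0.05, 3.0, 9)
print(np.round(tunnel_activation(x), 3))   # transmission grows with sub-barrier energy
```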
- Oceania > Australia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey (0.04)
- (5 more...)
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation
Yan, Weixiang, Liu, Haitian, Wang, Yunkun, Li, Yunzhe, Chen, Qian, Wang, Wen, Lin, Tingyu, Zhao, Weishan, Zhu, Li, Deng, Shuiguang, Sundaram, Hari
Large Language Models (LLMs) have demonstrated remarkable performance on coding-related tasks, particularly in assisting humans with programming and facilitating programming automation. However, existing benchmarks for evaluating the code understanding and generation capacities of LLMs suffer from severe limitations. First, most benchmarks focus on a narrow range of popular programming languages and specific tasks, whereas real-world software development often involves multilingual programming environments that must satisfy diverse requirements; practical programming likewise demands multi-task settings for testing the coding capabilities of LLMs comprehensively and robustly. Second, most benchmarks fail to consider the actual executability and the consistency of execution results of the generated code. To bridge these gaps between existing benchmarks and the expectations of practical applications, we introduce CodeScope, an execution-based, multilingual, multi-task, multi-dimensional evaluation benchmark for comprehensively gauging LLM capabilities on coding tasks. CodeScope covers 43 programming languages and 8 coding tasks. It evaluates the coding performance of LLMs from three dimensions (perspectives): difficulty, efficiency, and length. To facilitate execution-based evaluations of code generation, we develop MultiCodeEngine, an automated code execution engine that supports 14 programming languages. Finally, we systematically evaluate and analyze 8 mainstream LLMs on CodeScope tasks, demonstrating that CodeScope offers superior breadth and challenge for evaluating LLMs on code understanding and generation compared to other benchmarks. The CodeScope benchmark and datasets are publicly available at https://github.com/WeixiangYAN/CodeScope.
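MultiCodeEngine itself is not reproduced here, but the essence of execution-based scoring is small: run a candidate program against I/O test cases in a subprocess and count exact output matches. A minimal single-language sketch (the function name and scoring rule are illustrative assumptions):

```python
# Illustrative execution-based scoring for one language (Python).
import subprocess
import tempfile

def run_python_candidate(code: str, test_cases: list[tuple[str, str]], timeout=5):
    """Return the fraction of (stdin, expected_stdout) pairs the code passes."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    passed = 0
    for stdin, expected in test_cases:
        try:
            out = subprocess.run(["python3", path], input=stdin, timeout=timeout,
                                 capture_output=True, text=True)
            passed += out.stdout.strip() == expected.strip()
        except subprocess.TimeoutExpired:
            pass                                 # non-terminating code scores 0
    return passed / len(test_cases)

print(run_python_candidate("print(int(input()) * 2)", [("3", "6"), ("10", "20")]))  # 1.0
```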
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (25 more...)
Physical Computing: A Category Theoretic Perspective on Physical Computation and System Compositionality
Dehghani, Nima, Caterina, Gianluca
This paper introduces a category theory-based framework to redefine physical computing in light of advancements in quantum computing and non-standard computing systems. By integrating classical definitions within this broader perspective, the paper rigorously recontextualizes what constitutes physical computing devices and processes. It demonstrates how the compositional nature and relational structures of physical computing systems can be coherently formalized using category theory. This approach not only encapsulates recent formalisms in physical computing but also offers a structured method to explore the dynamic interactions within these systems.
Keywords: Computation, Computability, Physical Computation, Category Theory
1. Introduction. The roots of computability trace back to Leibniz, who invented a mechanical calculator for automating the evaluation of mathematical expressions [23, 20, 9]. At the Paris international conference for mathematics, David Hilbert extended Leibniz's fascination by proposing the Entscheidungsproblem (the decision problem), questioning the existence of an "effective procedure" to determine the truth or falsity of mathematical statements [29]. (Prior to the emergence of the term "algorithm", procedural calculation through a finite number of exact, finite instructions was labeled an "effective procedure" or "effective calculation".) Alan Turing and Alonzo Church independently demonstrated the impossibility of resolving the Entscheidungsproblem. This discovery, known as the "Church-Turing thesis", posited that no effective procedure (or "algorithm" in contemporary terms) can decide the truth or falsity of every mathematical statement.
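To make the compositionality claim concrete, here is a minimal formalization sketch in the symmetric monoidal style common in categorical treatments of processes; the category name, tensor, and unit below are illustrative assumptions, not the paper's exact definitions.

```latex
% Illustrative sketch, not the paper's definitions. Objects: physical state
% spaces; morphisms: physically realizable processes between them.
\[
  f \colon A \to B, \qquad g \colon B \to C
  \quad\Longrightarrow\quad g \circ f \colon A \to C
  \qquad \text{(sequential operation)}
\]
\[
  f \otimes f' \colon A \otimes A' \to B \otimes B'
  \qquad \text{(side-by-side, parallel composition)}
\]
% With a unit object $I$ (the trivial system), $(\mathbf{Phys}, \otimes, I)$
% forms a symmetric monoidal category; a physical computation is then a
% morphism equipped with an encoding of abstract data into $A$ and a
% decoding of results from $B$.
```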
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Massachusetts > Essex County > Beverly (0.04)
- (2 more...)
A Prompt Learning Framework for Source Code Summarization
Sun, Weisong, Fang, Chunrong, You, Yudu, Chen, Yuchen, Liu, Yi, Wang, Chong, Zhang, Jian, Zhang, Quanjun, Qian, Hanwei, Zhao, Wei, Liu, Yang, Chen, Zhenyu
(Source) code summarization is the task of automatically generating natural language summaries for given code snippets. Such summaries play a key role in helping developers understand and maintain source code. Recently, with the successful application of large language models (LLMs) in numerous fields, software engineering researchers have also attempted to adapt LLMs to solve code summarization tasks. The main adaptation schemes include instruction prompting and task-oriented fine-tuning. However, instruction prompting involves designing crafted prompts for zero-shot learning or selecting appropriate samples for few-shot learning, and requires users to have professional domain knowledge, while task-oriented fine-tuning requires high training costs. In this paper, we propose a novel prompt learning framework for code summarization called PromptCS. PromptCS trains a prompt agent that can generate continuous prompts to unleash the potential of LLMs in code summarization. Compared to human-written discrete prompts, the continuous prompts are produced under the guidance of LLMs and are therefore easier for LLMs to understand. PromptCS freezes the parameters of the LLM when training the prompt agent, which greatly reduces the requirements for training resources. We evaluate PromptCS on the CodeSearchNet dataset involving multiple programming languages. The results show that PromptCS significantly outperforms instruction prompting schemes on all four widely used metrics. On some base LLMs, e.g., CodeGen-Multi-2B and StarCoderBase-1B and -3B, PromptCS even outperforms the task-oriented fine-tuning scheme. More importantly, PromptCS trains faster than the task-oriented fine-tuning scheme, with a more pronounced advantage on larger LLMs. The results of the human evaluation demonstrate that PromptCS generates better summaries than the baselines.
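The mechanism behind continuous prompts is compact enough to sketch: a small trainable matrix of prompt embeddings is prepended to the input embeddings of a frozen LLM, and only that matrix receives gradients. The code below is a generic prompt-tuning sketch, not the PromptCS implementation; the base model name and prompt length are assumptions.

```python
# Generic prompt-tuning sketch (illustrative; not the PromptCS implementation).
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Salesforce/codegen-350M-multi"           # assumed base model
tok = AutoTokenizer.from_pretrained(name)
llm = AutoModelForCausalLM.from_pretrained(name)
for p in llm.parameters():
    p.requires_grad_(False)                      # frozen backbone: cheap training

emb = llm.get_input_embeddings()
n_prompt = 20                                    # assumed prompt length
prompt = nn.Parameter(torch.randn(n_prompt, emb.embedding_dim) * 0.02)
opt = torch.optim.Adam([prompt], lr=1e-4)

def loss_for_pair(code: str, summary: str) -> torch.Tensor:
    """Prepend the learned prompt to the code embeddings and score the summary."""
    code_ids = tok(code, return_tensors="pt").input_ids
    sum_ids = tok(summary, return_tensors="pt").input_ids
    inputs = torch.cat([prompt[None], emb(code_ids), emb(sum_ids)], dim=1)
    labels = torch.cat([torch.full((1, n_prompt + code_ids.size(1)), -100),
                        sum_ids], dim=1)         # loss only on summary tokens
    return llm(inputs_embeds=inputs, labels=labels).loss

loss = loss_for_pair("def add(a, b):\n    return a + b", "Add two numbers.")
opt.zero_grad(); loss.backward(); opt.step()     # gradients reach only `prompt`
```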
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.05)
- Asia > Singapore (0.05)
- (28 more...)
An Extractive-and-Abstractive Framework for Source Code Summarization
Sun, Weisong, Fang, Chunrong, Chen, Yuchen, Zhang, Quanjun, Tao, Guanhong, Han, Tingxu, Ge, Yifei, You, Yudu, Luo, Bin
(Source) code summarization aims to automatically generate summaries/comments for a given code snippet in the form of natural language. Such summaries play a key role in helping developers understand and maintain source code. Existing code summarization techniques can be categorized into extractive methods and abstractive methods. The extractive methods extract a subset of important statements and keywords from the code snippet using retrieval techniques, and generate a summary that preserves the factual details in those statements and keywords. However, such a subset may miss identifier or entity names, and consequently the naturalness of the generated summary is usually poor. The abstractive methods can generate human-written-like summaries by leveraging encoder-decoder models from the neural machine translation domain. The generated summaries, however, often miss important factual details. To generate human-written-like summaries with preserved factual details, we propose a novel extractive-and-abstractive framework. The extractive module in the framework performs extractive code summarization: it takes in the code snippet and predicts the important statements containing key factual details. The abstractive module performs abstractive code summarization: it takes in the entire code snippet and the important statements in parallel and generates a succinct, human-written-like natural language summary. We evaluate the effectiveness of our technique, called EACS, by conducting extensive experiments on three datasets involving six programming languages. Experimental results show that EACS significantly outperforms state-of-the-art techniques on all three widely used metrics, including BLEU, METEOR, and ROUGE-L.
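A toy sketch of the two-stage pipeline (illustrative only; EACS trains learned models for both stages, whereas a keyword heuristic stands in for the extractor here):

```python
# Illustrative extractive-then-abstractive flow, not EACS's models.
def extract_important(code: str, top_k: int = 3) -> list[str]:
    """Toy extractor: rank statements by crude salience cues."""
    lines = [l.strip() for l in code.splitlines() if l.strip()]
    scored = sorted(lines, reverse=True,
                    key=lambda l: ("return" in l) + ("=" in l) + len(l) / 100)
    return scored[:top_k]

def summarize(code: str) -> str:
    important = extract_important(code)
    # The abstractive module consumes both streams in parallel; here we only
    # show the combined input such a module would receive.
    return f"CODE:\n{code}\nIMPORTANT:\n" + "\n".join(important)

print(summarize("def mean(xs):\n    total = sum(xs)\n    return total / len(xs)"))
```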
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- (38 more...)
Automatic Code Summarization via ChatGPT: How Far Are We?
Sun, Weisong, Fang, Chunrong, You, Yudu, Miao, Yun, Liu, Yi, Li, Yuekang, Deng, Gelei, Huang, Shenghan, Chen, Yuchen, Zhang, Quanjun, Qian, Hanwei, Liu, Yang, Chen, Zhenyu
To support software developers in understanding and maintaining programs, various automatic code summarization techniques have been proposed to generate a concise natural language comment for a given code snippet. Recently, the emergence of large language models (LLMs) has led to a great boost in the performance of natural language processing tasks. Among them, ChatGPT is the most popular and has attracted wide attention from the software engineering community. However, it remains unclear how ChatGPT performs in (automatic) code summarization. Therefore, in this paper, we focus on evaluating ChatGPT on a widely used Python dataset called CSN-Python and comparing it with several state-of-the-art (SOTA) code summarization models. Specifically, we first explore an appropriate prompt to guide ChatGPT to generate in-distribution comments. Then, we use such a prompt to ask ChatGPT to generate comments for all code snippets in the CSN-Python test set. We adopt three widely used metrics (including BLEU, METEOR, and ROUGE-L) to measure the quality of the comments generated by ChatGPT and the SOTA models (including NCS, CodeBERT, and CodeT5). The experimental results show that, in terms of BLEU and ROUGE-L, ChatGPT's code summarization performance is significantly worse than that of all three SOTA models. We also present some cases and discuss the advantages and disadvantages of ChatGPT in code summarization. Based on the findings, we outline several open challenges and opportunities in ChatGPT-based code summarization.
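The evaluation loop itself is straightforward to sketch; the model name, prompt wording, and metric configuration below are assumptions, not the paper's tuned setup (METEOR is omitted for brevity):

```python
# Illustrative "prompt ChatGPT, then score the comment" loop.
from openai import OpenAI
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

client = OpenAI()  # requires OPENAI_API_KEY in the environment

def chatgpt_summary(code: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",   # assumed model; the paper's prompt is not reproduced
        messages=[{"role": "user",
                   "content": f"Write a short docstring-style comment for:\n{code}"}],
    )
    return resp.choices[0].message.content.strip()

def score(candidate: str, reference: str) -> dict:
    smooth = SmoothingFunction().method1
    bleu = sentence_bleu([reference.split()], candidate.split(),
                         smoothing_function=smooth)
    rouge = rouge_scorer.RougeScorer(["rougeL"]).score(
        reference, candidate)["rougeL"].fmeasure
    return {"BLEU": bleu, "ROUGE-L": rouge}

code = "def add(a, b):\n    return a + b"
print(score(chatgpt_summary(code), "Add two numbers and return the result."))
```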
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Spain > Galicia > Madrid (0.04)
- (28 more...)